Statistical Modeling and Machine Learning
Road Map for Choosing Between Statistical Modeling and Machine Learning
"When we raise money it's AI, when we hire it's machine learning, and when we do the work it's logistic regression."
Machine learning (ML) may be distinguished from statistical models (SMs) using any of three considerations:
Uncertainty: SMs explicitly take uncertainty into account by specifying a probabilistic model for the data.
Structural: SMs typically start by assuming additivity of predictor effects when specifying the model.
Empirical: ML is more empirical, including allowance for high-order interactions that are not pre-specified, whereas SMs have identified parameters of special interest.
There is a growing number of hybrid methods combining characteristics of traditional SMs and ML, especially in the Bayesian world.
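To make the structural point concrete, here is a minimal sketch in Python. It assumes a made-up, balanced 2x2 data set whose outcome is the exclusive-or of two binary predictors, so all of the signal sits in the interaction. The function names (`additive_fit`, `saturated_fit`, `sse`) are hypothetical and exist only for this illustration. An additive model cannot represent this outcome, while a model with the interaction term fits it exactly. In the sketch the interaction is written in by hand; a flexible ML method would have to discover it from the data.

```python
from itertools import product
from statistics import fmean

# Hypothetical toy data: one observation per cell of a balanced 2x2 design.
# The outcome is x1 XOR x2, i.e. a pure interaction with no main effects.
data = [((x1, x2), float(x1 != x2)) for x1, x2 in product([0, 1], repeat=2)]


def additive_fit(rows):
    """Least-squares fit of y = a + b1*x1 + b2*x2.

    Valid for a balanced 2x2 design, where each slope is simply the
    difference between the two marginal means of the outcome.
    """
    grand = fmean(y for _, y in rows)
    b1 = (fmean(y for (x1, _), y in rows if x1 == 1)
          - fmean(y for (x1, _), y in rows if x1 == 0))
    b2 = (fmean(y for (_, x2), y in rows if x2 == 1)
          - fmean(y for (_, x2), y in rows if x2 == 0))
    a = grand - 0.5 * b1 - 0.5 * b2
    return lambda x1, x2: a + b1 * x1 + b2 * x2


def saturated_fit(rows):
    """Fit y = a + b1*x1 + b2*x2 + b12*x1*x2 (reproduces every cell)."""
    cell = dict(rows)
    a = cell[(0, 0)]
    b1 = cell[(1, 0)] - a
    b2 = cell[(0, 1)] - a
    b12 = cell[(1, 1)] - cell[(1, 0)] - cell[(0, 1)] + a
    return lambda x1, x2: a + b1 * x1 + b2 * x2 + b12 * x1 * x2


def sse(model, rows):
    """Sum of squared errors of a fitted model on the rows."""
    return sum((y - model(x1, x2)) ** 2 for (x1, x2), y in rows)


additive = additive_fit(data)
saturated = saturated_fit(data)
print("additive SSE: ", sse(additive, data))
print("saturated SSE:", sse(saturated, data))
```

The additive fit predicts the overall mean in every cell because both marginal effects are zero. This is an extreme case chosen for clarity; in real data, additivity is often a reasonable starting assumption.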
Tale of Two Cultures! Statistical Modeling and Machine Learning
Statistics emphasizes inference, whereas machine learning emphasizes prediction. There has been a lot of debate on this topic, and there are strong sentiments in both camps. The intent of this article is not to add fuel to the fire or to side with one camp or the other, but to share some high-level views so that budding data scientists get a clear perspective. First, you need data; whether you have lots of data or only samples of data is not the interesting question. The interesting question is what approaches you would take to analyze the data to solve a problem.
The difference between Statistical Modeling and Machine Learning, as I see it
The basic goal of Statistical Modeling is to answer the question, "Which probabilistic model could have generated the data I observed?" For example, if your data represent counts, such as the number of customers churned or cells divided, then a model from the Poisson family, the Negative Binomial family, or a zero-inflated model might be appropriate. Once a statistical model has been chosen, the estimated model serves as the device for inquiries: testing hypotheses, creating predicted values, and computing measures of confidence. The estimated model becomes the lens through which we interpret the data. We never claim that the selected model generated the data; we view it as a reasonable approximation of the stochastic process, and confirmatory inference is based on that approximation.
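As a rough illustration of this workflow, the sketch below fits a Poisson model to a small set of made-up weekly counts and then uses the estimated model for the three kinds of inquiry mentioned above. The data, the null value `lam0`, and the helper `poisson_pmf` are assumptions introduced only for this example. The interval and test are simple Wald approximations, which is one common choice among several.

```python
import math
from statistics import fmean, variance

# Hypothetical counts, e.g. customers churned per week (made-up numbers).
counts = [2, 4, 3, 6, 1, 3, 5, 0, 3, 3]
n = len(counts)

# Estimation: the maximum-likelihood estimate of the Poisson rate
# is the sample mean.
lam_hat = fmean(counts)

# Model check: a Poisson model implies variance == mean. A ratio well
# above 1 would point toward a Negative Binomial model instead.
dispersion = variance(counts) / lam_hat

# Measure of confidence: approximate 95% Wald interval for the rate.
se = math.sqrt(lam_hat / n)
ci_low, ci_high = lam_hat - 1.96 * se, lam_hat + 1.96 * se

# Hypothesis test: Wald z-test of H0: rate == lam0 (lam0 is an assumed
# null value chosen for illustration), with a two-sided normal p-value.
lam0 = 2.0
z = (lam_hat - lam0) / se
p_value = math.erfc(abs(z) / math.sqrt(2))


def poisson_pmf(k, lam):
    """Probability of observing exactly k events under Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Predicted value: probability of a week with zero events.
p_zero = poisson_pmf(0, lam_hat)

print(f"rate estimate: {lam_hat:.3f}")
print(f"dispersion ratio: {dispersion:.3f}")
print(f"95% CI: ({ci_low:.3f}, {ci_high:.3f})")
print(f"z = {z:.3f}, p = {p_value:.3f}")
print(f"P(Y = 0) = {p_zero:.4f}")
```

Every number printed here is conditional on the Poisson assumption, which is why the dispersion check comes before the inference steps.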